This answer sheet should be used for your VAST
Challenge 2017 Mini-Challenge 2 submission.
PRIMARY
Bindu Gupta, bindu.gupta2@tcs.com
Kaushal Paneri, kaushal.paneri@tcs.com
Gunjan Sehgal, sehgal.gunjan@tcs.com
Karamjit Singh, karamjit.singh@tcs.com
Geetika Sharma, geetika.s@tcs.com
Gautam Shroff, gautam.shroff@tcs.com
Student Team: NO
Tools Used:
d3.js
HTML5
CSS
jQuery / Javascript
Python
Excel
Approximately how many hours were spent working on this submission in total?
100 hours
May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2017 is complete? YES
Video
Questions
MC2.1 – Characterize the sensors’ performance and operation. Are they all working properly at all times? Can you detect any unexpected behaviors of the sensors through analyzing the readings they capture?
Limit your response to no more than 9 images and 1000 words.
1. We computed the mean level of each chemical at each sensor. Sensors 1 and 2 had lower means than the other sensors for all four chemicals. On further analysis, we found that the wind blows towards these two sensors least often.
2. Sensors 3 and 4 have the highest means for all four chemicals.
3. For AGOC-3A, sensors 3, 4, 5, 6, 8 and 9 report multiple readings at the same time instant. This indicates that these sensors do not work properly at all times.
4. To discover sensor anomalies, we use a Gaussian model for each sensor. We compute the mean and standard deviation of the readings for every sensor, at every location, for each hour. If a sensor reading falls outside the range mean ± 2 standard deviations, we tag it as an anomaly. Using this model we obtain daily anomaly counts for each sensor.
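The Gaussian rule in item 4 can be sketched as follows. This is a minimal illustration, not the code used for the submission: the `(sensor, date, hour, value)` tuple layout, the grouping by sensor and hour of day, and the function names are all assumptions made for the example, and it handles one chemical at a time.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_anomalies(readings, k=2.0):
    """Return readings outside mean +/- k*std of their (sensor, hour) group.

    `readings` is a list of (sensor, date, hour, value) tuples. This layout
    is an assumption for the sketch, not the challenge's file format.
    """
    groups = defaultdict(list)
    for sensor, _date, hour, value in readings:
        groups[(sensor, hour)].append(value)

    # One Gaussian (mean, population std) per sensor and hour of day.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}

    anomalies = []
    for sensor, date, hour, value in readings:
        mu, sigma = stats[(sensor, hour)]
        if abs(value - mu) > k * sigma:
            anomalies.append((sensor, date, hour, value))
    return anomalies


def daily_counts(anomalies):
    """Count anomalies per (sensor, date), as used by a calendar heatmap."""
    counts = defaultdict(int)
    for sensor, date, _hour, _value in anomalies:
        counts[(sensor, date)] += 1
    return dict(counts)
```

With k = 2, a group whose readings are all identical has zero standard deviation and produces no anomalies, since no reading differs from the mean.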
Our sensor anomaly tool enables visual analysis of sensor anomaly data. The top left pane shows a map of the park, with the locations of sensors and factories marked using bubbles. The top middle pane shows a calendar heatmap in which the total number of sensor anomalies on a day is mapped to colour. The top right pane shows a wind direction compass, with the colour of each sector mapped to the number of times the wind was in that sector. The compass also acts as a filter for wind direction: on selecting a sector, the calendar heatmap updates to show anomalies for the days on which the wind direction fell in the chosen range.
We represent chemicals using glyphs depicting their properties, e.g., Appluimonia has a bad odour, and Methylosmolene and AGOC-3A are highly volatile. These are shown on the bottom pane. On clicking a day in the calendar, the map updates to show the number of anomalies recorded by each sensor through the size and colour of its bubble. Additionally, a radial chart on the bottom right shows the hourly distribution of anomalies for the chosen day. The sizes of the chemical glyphs update to show their individual anomaly counts. Clicking on a sensor shows the anomalies it recorded for the four chemicals. Clicking on an hour of the day displays the wind direction for that hour on the map of the park.
We found that the wind direction sensor malfunctioned on August 1, 2 and 3, as is evident from the images below. Sensors 1 and 3, which show high anomaly counts, lie opposite to the wind direction, whereas sensor 6, which lies along the wind direction, shows no anomalies. Further, the recorded wind direction does not change throughout the day.
Filtering on the north-east wind direction, we find that sensor 9 is the most anomalous, which is expected because the wind blows towards it. On 25th August, AGOC-3A and Methylosmolene show more anomalies, whereas on 11th December, AGOC-3A and Chlorodinine show more anomalies.
5. We used a Bayesian network model to analyse the data further, as follows. First, we created a new data attribute by taking the cosine of the angle θ_FS defined by a sensor S_i, i=1,...,9, a factory F_j, j=1,...,4, and the wind direction w. When cos θ_FS is close to 1, the wind from factory F blows almost directly towards sensor S. For each time instant and sensor, we compute the cosine of this angle for each factory and take the average of the cosine values. Next, we append the log (base 10) of the readings from all sensors for a particular chemical in one column, along with an attribute Monitor that indicates which sensor each reading came from. We also include the wind speed attribute. Using this data, shown in the image below, we learn a Bayesian network with Reading as the target variable.
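One possible reading of the cosine attribute is sketched below: the cosine of the angle between the factory-to-sensor vector and the direction in which the wind is travelling. The planar coordinates (x pointing east, y pointing north), the convention that wind direction is the compass bearing the wind blows from, and the function names are assumptions for illustration only.

```python
import math


def wind_alignment_cosine(factory_xy, sensor_xy, wind_from_deg):
    """Cosine of the angle between factory->sensor and the wind's travel.

    Assumptions: planar coordinates (x = east, y = north) and a wind
    direction given as the bearing the wind blows FROM, in degrees
    clockwise from north.
    """
    dx = sensor_xy[0] - factory_xy[0]
    dy = sensor_xy[1] - factory_xy[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        raise ValueError("factory and sensor coincide")
    # The wind travels towards the bearing opposite to where it comes from.
    travel = math.radians((wind_from_deg + 180.0) % 360.0)
    wx, wy = math.sin(travel), math.cos(travel)  # unit vector of travel
    return (dx * wx + dy * wy) / dist


def mean_alignment(factories, sensor_xy, wind_from_deg):
    """Average the cosine over all factories for one sensor and instant."""
    cosines = [wind_alignment_cosine(f, sensor_xy, wind_from_deg)
               for f in factories]
    return sum(cosines) / len(cosines)
```

A value near 1 means the sensor sits almost directly downwind of the factory, a value near -1 means it sits upwind, and values near 0 mean the wind crosses the factory-to-sensor line.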
We have designed a tool to query the Bayesian network, as shown in the image below. The network is visualised as an n × n grid of the nodes (data attributes) in the network. We show the prior distributions of the attributes along the diagonal cells and pair-wise scatter plots of the data in the cells above the diagonal, with the row attribute mapped to the x axis and the column attribute mapped to the y axis. A user can perform a conditional query by selecting the desired bins of one or more attributes and pressing the query button. The posterior distributions corresponding to the conditional query are computed at the backend, and the diagonal cells update to show them. A comparison view shows the prior distributions in yellow and the posterior distributions in maroon, so that changes can be seen easily.
Below we show the network for AGOC-3A being queried for the anomalous condition of high wind speed, high cosine value and low reading value. The posterior distribution of the Monitor attribute shows that the probability of sensor 1 has increased (monitor bin names differ from sensor names). We may conclude that sensor 1 is anomalous in the sense that it detects low readings even when the wind blows towards it at high speed.
We repeated the analysis for the other chemicals and found that sensor 1 is also anomalous for Appluimonia and Methylosmolene, whereas sensor 9 is anomalous for Chlorodinine.
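The prior-versus-posterior comparison behind such a query can be illustrated with direct counting over discretised rows. This is a simplified stand-in, not Bayesian network inference: a real network answers the query from its factored conditional probability tables, whereas this sketch conditions on the raw joint counts. The attribute names and bin labels are hypothetical.

```python
from collections import Counter


def distribution(rows, attribute):
    """Empirical distribution of one attribute over a list of dict rows."""
    counts = Counter(row[attribute] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def conditional_query(rows, target, evidence):
    """Return (prior, posterior) distributions of `target`.

    `evidence` maps attribute names to the selected bin, e.g.
    {"wind_speed": "high", "cosine": "high", "reading": "low"}.
    """
    matching = [r for r in rows
                if all(r[k] == v for k, v in evidence.items())]
    if not matching:
        raise ValueError("no rows match the evidence")
    return distribution(rows, target), distribution(matching, target)
```

A sensor whose posterior probability rises well above its prior under the evidence "high wind speed, high cosine, low reading" is the kind of candidate the text flags as anomalous.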
MC2.2 – Now turn your attention to the chemicals themselves. Which chemicals are being detected by the sensor group? What patterns of chemical releases do you see, as being reported in the data?
Limit your response to no more than 6 images and 500 words.
1. By simply counting the number of times each chemical was detected by each sensor, we discovered that Appluimonia and Chlorodinine are detected equally by all sensors.
2. We computed the cosine of the angle θ_FS defined by a sensor S_i, i=1,...,9, a factory F_j, j=1,...,4, and the wind direction w. We weighted the sensor readings by cos θ_FS when it is positive and by 0 otherwise. We then computed the cross-correlation matrix over all sensors for each chemical, and obtained the following results.
For AGOC-3A, sensors 1 & 2 are correlated, and 1 & 3 and 2 & 3 are weakly correlated.
For Appluimonia, sensors 2, 3, 4 and 5 are correlated, and 1 & 3 are weakly correlated.
For Chlorodinine, sensors 3 & 4 are correlated, and 1 & 2, 2 & 3 and 4 & 5 are weakly correlated.
For Methylosmolene, sensors 3 & 4 are correlated.
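The weighting and correlation steps in item 2 can be sketched as below. The sketch assumes that "cross-correlation" means the zero-lag Pearson correlation between pairs of weighted sensor series, which the text does not state explicitly; the function names are illustrative.

```python
import math


def weight_readings(readings, cosines):
    """Scale each reading by cos(theta) when positive, and by 0 otherwise."""
    return [r * max(c, 0.0) for r, c in zip(readings, cosines)]


def pearson(xs, ys):
    """Zero-lag Pearson correlation of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(series_by_sensor):
    """Pairwise correlations, keyed by (sensor_a, sensor_b)."""
    sensors = sorted(series_by_sensor)
    return {(a, b): pearson(series_by_sensor[a], series_by_sensor[b])
            for a in sensors for b in sensors}
```

Here `series_by_sensor` would map each sensor number to its weighted, time-aligned readings for one chemical, so one matrix is produced per chemical.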
MC2.3 – Which factories are responsible for which chemical releases? Carefully describe how you determined this using all the data you have available. For the factories you identified, describe any observed patterns of operation revealed in the data.
Limit your response to no more than 8 images and 1000 words.
To find out which factories are responsible for which chemical releases, we used the Bayesian network approach described in MC2.1. We build a network over wind direction, wind speed and the readings of a particular chemical. Below you can see the network for AGOC-3A readings on sensor 6. When we query on the high-value bins of the readings, we find that the probability of the 252-288 wind direction bin increases considerably, while the probability of the 36-72 wind direction bin remains low. This indicates that AGOC-3A is detected at high values when the wind blows from 252-288. On further analysis, we found that this angle places the sensor downstream of Kasios and Roadrunner. So we can conclude that AGOC-3A is being released from Roadrunner and Kasios. On the other hand, the low probability for the 36-72 angle (which is downstream of Radiance and Indigo) indicates that AGOC-3A is not released in high amounts from Radiance or Indigo.
For Chlorodinine, we found peaks at angles that are downstream of Roadrunner and Kasios. So we can say that Chlorodinine is released more from Roadrunner and Kasios.
For Methylosmolene, we found peaks at angles that are downstream of Roadrunner and Kasios. So we can say that Methylosmolene is released more from Roadrunner and Kasios.
For Appluimonia, we found peaks at angles that are favourable for Indigo. So Appluimonia is released in the largest amounts by Indigo, followed by the other factories.
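The step from a wind direction bin to candidate factories can be sketched as a bearing check: a factory is a candidate source when the bearing from the sensor to that factory falls in the bin the wind blows from. The 36-degree bin width matches the 36-72 and 252-288 bins quoted above; the coordinates, factory labels and wind-from convention are assumptions for illustration.

```python
import math

BIN_WIDTH = 36  # degrees, matching the 36-72 and 252-288 bins in the text


def direction_bin(degrees):
    """Map a compass bearing to its (low, high) bin."""
    low = int(degrees % 360 // BIN_WIDTH) * BIN_WIDTH
    return (low, low + BIN_WIDTH)


def bearing_to(origin_xy, target_xy):
    """Compass bearing from origin to target (x = east, y = north)."""
    dx = target_xy[0] - origin_xy[0]
    dy = target_xy[1] - origin_xy[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def upwind_factories(sensor_xy, factories, wind_bin):
    """Names of factories lying in the bin the wind blows FROM.

    `factories` maps a name to hypothetical (x, y) coordinates.
    """
    return sorted(name for name, xy in factories.items()
                  if direction_bin(bearing_to(sensor_xy, xy)) == wind_bin)
```

A hard bin boundary is a coarse test. A plume spreads as it travels, so a factory whose bearing falls just outside the bin may still contribute to the readings.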
We tried to link the sensor data with the movement data given in MC1 to determine whether the release of chemicals has an effect on the nature preserve or on the movement of vehicles. From the sensor anomaly tool, we found that Appluimonia (bad odour) was released in high amounts on 13th August, when the wind was blowing towards general gate 7. We counted the unique vehicles at general gate 7 on different days, and this count was very low on 13th August. So we can conclude that, because of the bad odour near general gate 7, fewer vehicles came there.
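The per-day unique vehicle count can be sketched as follows. The `(timestamp, car_id, gate_name)` record layout, the "YYYY-MM-DD HH:MM:SS" timestamp format and the gate label used in the usage line are assumptions about the MC1 movement data, not details taken from this submission.

```python
from collections import defaultdict


def unique_vehicles_per_day(records, gate):
    """Count distinct vehicle ids seen at `gate` on each calendar day.

    `records` is an iterable of (timestamp, car_id, gate_name) tuples with
    timestamps formatted as "YYYY-MM-DD HH:MM:SS" (assumed layout).
    """
    seen = defaultdict(set)
    for timestamp, car_id, gate_name in records:
        if gate_name == gate:
            seen[timestamp[:10]].add(car_id)  # first 10 chars = the date
    return {day: len(ids) for day, ids in seen.items()}


# Hypothetical usage: counts = unique_vehicles_per_day(rows, "general-gate7")
```

Comparing the count for one day against the counts for other days with similar weekday and season would help separate an odour effect from ordinary variation in traffic.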